Transition to Self-compatibility Associated With Dominant S-allele in a Diploid Siberian Progenitor of Allotetraploid Arabidopsis kamchatica Revealed by Arabidopsis lyrata Genomes

Abstract A transition to selfing can be beneficial when mating partners are scarce, for example, due to ploidy changes or at species range edges. Here, we explain how self-compatibility evolved in diploid Siberian Arabidopsis lyrata, and how it contributed to the establishment of allotetraploid Arabidopsis kamchatica. First, we provide chromosome-level genome assemblies for two self-fertilizing diploid A. lyrata accessions, one from North America and one from Siberia, including a fully assembled S-locus for the latter. We then propose a sequence of events leading to the loss of self-incompatibility in Siberian A. lyrata, date this independent transition to ∼90 Kya, and infer evolutionary relationships between Siberian and North American A. lyrata, showing an independent transition to selfing in Siberia. Finally, we provide evidence that this selfing Siberian A. lyrata lineage contributed to the formation of the allotetraploid A. kamchatica and propose that the selfing of the latter is mediated by the loss-of-function mutation in a dominant S-allele inherited from A. lyrata.


Introduction
Most angiosperms are hermaphroditic, with bisexual flowers producing both female and male gametes, and can thus potentially self-fertilize. Diverse self-recognition systems based on pollen-pistil interactions evolved repeatedly Zhao et al. 2022), preventing inbreeding, and subsequently, several independent transitions from outcrossing to self-pollination have occurred through degradation of these recognition systems (Shimizu and Tsuchimatsu 2015). A transition to selfing provides an immediate advantage in the face of low population density, often at the edges of the species distribution (Levin 2012). Pinpointing the genetic changes undermining self-rejection in nature not only improves our understanding of self-incompatibility mechanisms but also provides a more complete evolutionary history of the self-compatible species, providing essential context to understand their genome evolution (Guo et al. 2009;Slotte et al. 2013;Vekemans et al. 2014;Durvasula et al. 2017;Mable et al. 2017;Fulgione et al. 2018;Mattila et al. 2020).
In Brassicaceae, the sporophytic self-incompatibility (SI) system involves a self-pollen recognition mechanism determined by the S-locus, where two main genes are linked: The male SCR gene is expressed in tapetum cells of anthers, the protein is embedded into the pollen coat and serves as a ligand for the receptor kinase coded by the female SRK gene, which is expressed on the surface of the stigma (Stein et al. 1991;Schopfer et al. 1999;Takayama et al. 2000Takayama et al. , 2001Takayama and Isogai 2005;Nasrallah 2019). A breakdown of SI and transition to self-compatibility occurs when recognition between SCR and SRK (or downstream signaling) leading to pollen rejection is impaired (Uyenoyama et al. 2001;Shimizu and Tsuchimatsu 2015;Mable et al. 2017). In outcrossing Arabidopsis species (e.g., Arabidopsis lyrata, Arabidopsis halleri, and Arabidopsis arenosa), more than ten different S-haplotypes can segregate in a population (Castric and Vekemans 2004;Castric et al. 2008). This haplotypic diversity is essential for an SI system to function and has been maintained by frequency-dependent balancing selection for over 8 My (Mable et al. 2003;Castric and Vekemans 2004;Mable et al. 2004;Castric et al. 2008;Llaurens et al. 2008;Le Veve et al. 2022). A diploid outcrossing individual can possess two different S-alleles but often only one of them is expressed due to dominance, thus increasing the chances of reproduction (Hatakeyama et al. 2001;Kusaba et al. 2002;Prigoda et al. 2005;Okamoto et al. 2007), although codominance has also been reported (Prigoda et al. 2005;Llaurens et al. 2008). The expression of only one S-allele increases the chances for successful mating in heterozygous outcrossers, however, which of the S-alleles will be expressed can differ in pollen and stigma (Bateman 1954). Pollen-driven dominance is more thoroughly described and is conditioned by different trans-acting microRNA precursors and their targets on recessive S-alleles. MicroRNAs produced by dominant S-alleles silence the expression of the SCR gene on recessive S-allele through methylation of a 5′ promoter of SCR (Kusaba et al. 2002;Shiba et al. 2006); (Tarutani et al. 2010;Durand et al. 2014;Fujii and Takayama 2018). As dominance is uncoupled from self-recognition in this system, a dominant loss-of-function mutation is possible and would yield a self-compatible phenotype in a heterozygous individual.
The ancestral state in the genus Arabidopsis is outcrossing due to self-incompatibility. However, self-compatible species have evolved multiple times: in the model species Arabidopsis thaliana, and allotetraploids Arabidopsis suecica and Arabidopsis kamchatica. One of the early challenges for a new polyploid is the scarcity of compatible karyotypes for mating, and competition with established nearby diploids (Levin 1975). Selfing alleviates such challenges. In A. suecica, the transition to self-compatibility was likely immediate following the cross between an A. thaliana with a nonfunctional dominant S-haplotype (Tsuchimatsu et al. 2010) and an outcrossing A. arenosa (Novikova et al. 2017). However, the origin of self-compatibility in A. kamchatica is less clear, as the species originated from multiple crosses between A. lyrata and A. halleri in East Asia (Shimizu et al. 2005;Shimizu-Inatsugi et al. 2009;Tsuchimatsu et al. 2012;Paape et al. 2018). Whereas A. halleri is an obligate outcrosser, A. lyrata is predominantly self-incompatible with described self-compatible populations restricted to the Great Lakes region of North America (Mable et (Willi et al. 2022). A selfing individual of A. lyrata collected in Yakutia has been reported as genetically closest to the A. lyrata subgenome of A. kamchatica (Shimizu-Inatsugi et al. 2009;Paape et al. 2018), but the evolutionary history of this selfing lineage and S-locus genotype has not been described.
Here, we ask 1) how and when self-compatibility evolved and spread in Siberian A. lyrata; 2) is it plausible that A. lyrata was already self-compatible when it contributed to allopolyploid A. kamchatica? and 3) could a loss of self-incompatibility in only one of the diploid ancestors (A. lyrata) be sufficient to render A. kamchatica selfcompatible? Broad sampling combining live and herbarium collections allowed us to describe the selfing lineage of A. lyrata in Siberia ranging between Lake Taymyr and Chukotka, across north-central and eastern Russia. We first present chromosome-level assemblies of a Siberian selfing A. lyrata and the reference North American selfing accession (Hu et al. 2011), characterize the genomic and structural differences between them, and describe the S-locus structure and the likely mechanism of the failure of selfincompatibility in the Siberian selfing populations. Using demographic modeling, we date the transition to selfing in Siberian A. lyrata and suggest that it happened prior or concurrent with the formation of allopolyploid A. kamchatica. We confirm that the Siberian selfing A. lyrata was likely one of the progenitors of the allotetraploid A. kamchatica using overall genetic relatedness assessment and the phylogeny of the SRK gene in the S-locus. Together, our results suggest that one of the allopolyploid A. kamchatica origins and its transition to selfing was facilitated by the loss-of-function in the dominant S-allele inherited from Siberian A. lyrata.

Genome Assembly of the Selfing Siberian NT1 Accession
We grew seeds of A. lyrata collected from three populations in Yakutia (supplementary table S1 suggesting self-compatibility. We observed that flowers of the selfing NT1 accession appeared to be smaller compared with flowers of outcrossing plants and another selfing accession MN47 from North America (supplementary fig. S1C-E, Supplementary Material online). We confirmed that selfpollen successfully germinated in a selfed NT1 accession and made pollen tubes, whereas self-pollination of an outcrossing plant from the NT8 population did not result in pollen tube growth (supplementary fig. S2, Supplementary Material online).
Various papers (Long et al. 2013;Slotte et al. 2013;Henry et al. 2014;Burns et al. 2021;Dukić and Bomblies 2022) have reported potential artifacts in the reference A. lyrata MN47 (version 1 or v1) genome assembly (Hu et al. 2011 (Long et al. 2013;Slotte et al. 2013;Burns et al. 2021;Dukić and Bomblies 2022). We confirmed the existence of such artifacts and corrected them through long-read DNA sequencing (supplementary table S2, Supplementary Material online). Specifically, we obtained 868,563 HiFi reads of the MN47 accession with N50 length of 20,206 bp (total length of raw read sequences is ∼17,6 Gbp; ∼80× coverage). In total, we assembled ∼244 Mb in 820 contigs with an N50 of 23.506 Mb, indicating that full chromosome arms of MN47 were assembled as single contigs. Contigs were scaffolded into eight chromosomes using the genomes of MN47 v1 and NT1 as guides. The scaffolded contigs amount to ∼209 Mb. Completeness of the new MN47 v2 A. lyrata genome assembly by BUSCO was 4,544 complete and single-copy (97.1%), 83 complete and duplicated (1.8%), 8 fragmented (0.2%), and 44 missing genes (0.9%) from the Brassicales_odb10 set. The placement and orientation of contigs in the scaffolds were corrected using previously published Hi-C data   Transition to Self-compatibility Associated with Dominant S-allele · https://doi.org/10.1093/molbev/msad122 MBE (NORs) of A. lyrata is located at the end of chromosome 3 (Lysak et al. 2006). We confirmed that chromosome 3 contains a partially assembled NOR using Basic Local Alignment Search Tool (Blast). Overall, we have assembled high-quality chromosome-level genomes for two A. lyrata accessions and through pairwise genome alignment, we identified several inversions up to 2.4 Mb long segregating in the species.
Breakdown of the SI System in Siberian A. lyrata NT1 Both genes flanking the S-locus (U-box and ARK3) were assembled in a single contig in the HiFi assembly before any scaffolding, indicating that the entire ∼44.5 kb S-locus of the NT1 accession was fully assembled. We further confirmed the completeness of the S-locus by mapping PacBio reads back to the assembly and found even coverage spanning the S-locus with no gaps (supplementary fig.  S8, Supplementary Material online). Blast analysis of SRK and SCR sequences from the known S-haplotypes (supplementary Data 1 and 2, Supplementary Material online) (Boggs et al. 2009a, b;Tsuchimatsu et al. 2010;Guo et al. 2011;Goubet et al. 2012;Tsuchimatsu et al. 2012) revealed no hits for SRK, and one hit for SCR from the A. halleri S12 haplogroup ( fig. 2B). Due to long-term frequency-dependent balancing selection on the S-locus in Brassicaceae, relatedness among S-haplotypes is not consistent with species relatedness, such that the closest sequences to A. halleri S12 (AhS12) are not other A. halleri S-haplotypes but rather specific S-haplotypes from A. lyrata S42 (AlS42) and A. kamchatica D (Ak-D) (Wright 1939;Vekemans and Slatkin 1994;Mable et al. 2003;Castric and Vekemans 2004;Kamau and Charlesworth 2005;Castric et al. 2008;Llaurens et al. 2008;Tsuchimatsu et al. 2012;Roux et al. 2013). We estimated a phylogeny of the known SCR protein sequences (Guo et al. 2011;Goubet et al. 2012) and the manually annotated NT1 A. lyrata SCR sequence from the Blast results ( fig. 2A). As expected, the SCR phylogeny has a different topology than the species phylogeny, as S-haplotypes are trans-specifically shared across Arabidopsis. The SCR phylogeny confirms that the closest haplotype to the NT1 A. lyrata S-locus is the S12 haplotype from A. halleri (AhS12).
We compared the structures of the AhS12 and NT1 S-loci ( fig. 2B) and confirmed the absence of SRK (i.e., the female component of the self-incompatibility system), which is sufficient to explain the selfing nature of the NT1 accession. We also mapped short reads from NT1 to the NT1 genome assembly plus the intact AhS12 sequence from A. halleri containing SRK, and found no reads mapped to SRK (supplementary fig. S9C, Supplementary Material online). This provides additional confirmation of a complete loss of SRK from the NT1 S-locus. Analyzing the SCR protein sequences more closely, we also observed a loss of one of the eight conserved cysteines in the NT1 SCR sequence, which are important in proteinfolding and the recognition of the SCR ligand by the SRK receptor (Kusaba et al. 2001;Mishima et al. 2003;Tsuchimatsu et al. 2010) (supplementary fig. S10A, Supplementary Material online). This suggests that the SCR protein is nonfunctional in the NT1 A. lyrata accession. We tested for expression of the SCR gene in the Comparison of the S-locus region of the A. lyrata NT1 genome assembly with the Arabidopsis halleri S12 haplotype (Durand et al. 2014). Links between S-loci are colored according to the Blast scores from highest (blue) to lowest (gray). SCR, SRK, and flanking U-box and ARK3 genes have green, orange, and purple borders, respectively. SRK gene appears to be completely absent from the S-locus of the NT1 A. lyrata selfing accession. The only Blast hit to SRK is a spurious hit to ARK3 as they both encode receptor-like serine/threonine kinases. (C) Protein sequence alignment of S-locus SCR genes from A. halleri and A. lyrata, including NT1. One of the eight conserved cysteines important for structural integrity has been lost from the NT1 SCR protein.
Kolesnikova et al. · https://doi.org/10.1093/molbev/msad122 MBE flowers of NT1 using RNAseq and did not detect any transcript of the AhS12 SCR (supplementary fig. S9A and B, Supplementary Material online), though this may be due to the timing of floral development as expression of SCR is transient (Burghgraeve et al. 2020). Sequence comparison of the SCR region between AhS12 and NT1 showed high similarity in the promoter region (supplementary fig. S9D, Supplementary Material online) indicating that structural rearrangements did not cause loss of expression-but nucleotide substitutions at critical sites cannot be excluded. To verify whether SCR is indeed nonfunctional and/or not expressed in NT1, we performed controlled crosses, fertilizing an outcrossing A. lyrata accession (NT8.4-24, which has a functional AhS12 haplogroup) with NT1 pollen, resulting in successful pollen tube growth (supplementary fig. S11, Supplementary Material online). This outcome is possible if 1) the SCR protein from the NT1 accession could not be recognized by SRK receptors from the same AhS12 haplogroup or 2) the SCR gene was not expressed at all. Both of these scenarios lead to the conclusion that the SCR gene is nonfunctional in the NT1 selfing Siberian A. lyrata accession. There is, however unlikely, an additional possibility: 3) A self-compatible reaction could be possible with a functional SCR in NT1 if the SRK gene from haplogroup AhS12 was not expressed in the outcrossing maternal plant (NT8.4-24). We describe scenario 3 as improbable because outcrossing maternal plant NT8.4-24 is heterozygous at the S-locus, possessing two S-alleles: AhS12 and AlS25. The latter is known to be either codominant or recessive to AhS12 as it belongs to a lower dominance class (Llaurens et al. 2008;Durand et al. 2014), therefore, AhSRK12 is most likely expressed in NT8.4-24.
According to the classification of S-haplotypes, AhS12 belongs to dominance class IV (the most dominant class), and it is documented that it has an sRNA precursor, which can silence the expression of SCR genes from S-haplotypes belonging to classes I, II, and III (Durand et al. 2014;Burghgraeve et al. 2020). Indeed, by Blast analysis, we identified an sRNA precursor sequence in the NT1 S-locus assembly similar to the mirS3 precursor of A. halleri S12 haplotype (Durand et al. 2014), suggesting a conserved dominance mechanism of A. lyrata S12.  fig. 3). This heterozygositybased assignment is supported by our observations of individuals growing in the greenhouse: NT1 populations produced seeds without crosses, whereas NT8 and NT12 populations did not. Allotetraploid A. kamchatica cooccurring in the same geographical region is also selfcompatible. To ensure that none of our A. lyrata samples were misclassified, we first mapped allotetraploid A. kamchatica samples in the same way to the NT1 A. lyrata reference without separating subgenomes. The majority of the single nucleotide polymorphisms (SNPs) in A. kamchatica represent divergent sites between the two subgenomes, which explains its high heterozygosity levels, clearly distinct from selfing A. lyrata samples (supplementary figs. S12, Supplementary Material online and 3).

Genotyping S-alleles in Outcrossers
We genotyped S-alleles of all the short-read sequenced accessions in our data set by using a genotyping pipeline for de novo discovery of divergent alleles  with both SCR and SRK sequences as the reference allele databases (Schierup et al. 2001;Mable et al. 2003;Bechsgaard et al. 2004;Castric and Vekemans 2007;Castric et al. 2008;Boggs et al. 2009a, b;Guo et al. 2009;Castric et al. 2010;Guo et al. 2011;Goubet et al. 2012;Dwyer et al. 2013;Durand et al. 2014;Mable et al. 2017;Tsuchimatsu et al. 2017;Mable et al. 2018;Takou et al. 2020;Kodera et al. 2021) (supplementary Data 1 and 2, Supplementary Material online and supplementary table S1, Supplementary Material online). For each outcrossing individual, we find two different SRK alleles and at most one SCR allele ( fig. 3C). Identifying SCR alleles is more difficult than SRK, likely due to an incomplete SCR database rather than these genes being absent in outcrossing individuals. Transition to Self-compatibility Associated with Dominant S-allele · https://doi.org/10.1093/molbev/msad122  This may be because the founder was a heterozygous outcrosser, and a certain amount of gene flow does occur between lineages, as self-compatibility does not prevent plants from mating with outcrossers.
To further investigate the relationships between selfing and outcrossing populations and to date the selfincompatibility breakdown, we implemented a series of demographic models in fastsimcoal26 (Excoffier et al. 2013). The best-fit model is shown ( fig. 3D, table 1), which includes divergence between selfers and outcrossers with a subsequent bottleneck in the selfing lineage, with asymmetric introgression between populations. The estimate of divergence time (T DIV ) in this model is ∼90 ka (87,756), though we suggest caution when interpreting such estimates. All tested models can be viewed in supplementary figure S15, Supplementary Material online, with corresponding parameters in supplementary table S4,  (Durand et al. 2014), and AhS63 belongs to class III of dominance (corresponding to AlS41 in Mable et al. 2018), which is expected to be recessive to the class IV AhS12 allele in A. lyrata (Prigoda et al. 2005 Paape et al. 2018). This is also apparent in the strong population structure of the S-allele combinations inherited from different parental lineages ( fig. 4C). The most common S-allele in A. kamchatica on the A. lyrata subgenome is AhS12 (AkS-D), which is also fixed in the selfcompatible Siberian A. lyrata lineage. Moreover, F1 crosses ( fig. 4A and B) show that the pollendominance mechanism is retained in self-compatible Siberian We, therefore, hypothesize that A. kamchatica with AhS1 (AkS-C)/AhS12 (AkS-D) combination of S-alleles was selfcompatible in the first generation due to dominance of the AhS12 S-allele inherited from self-compatible Siberian A. lyrata over AhS1 inherited from A. halleri.

Discussion
Full A. lyrata Genomes Selfing accessions can be considered natural inbred lines, which are especially useful in genomics, as the assembly of their genomes is not complicated by long heterozygous stretches. So far, only one selfing accession (MN47) of A. lyrata from North America has been fully assembled and serves as a reference for this species (Hu et al. 2011). An additional draft assembly of A. lyrata subsp. petraea has also been released (Paape et al. 2018), though its utility is hindered due to gaps in the assembly (12.75% missing) and lack of contiguity (scaffold N50 of 1.2 Mb). Furthermore, whereas a single reference genome provides a useful resource for short-read re-sequencing-based population genetic studies (Novikova et al. 2016; The 1001 Genomes Consortium 2016), reference bias is an increasingly recognized problem. Using long and proximityligation reads we assembled high-quality genomes of the Siberian selfing A. lyrata accession NT1 and reassembled North American A. lyrata MN47 accession. We found five inversions ranging from 0.3 to 2.4 Mb in length in between these independently evolved selfing accessions ( fig. 1 and supplementary table S3, Supplementary Material online). Large genomic structural rearrangements, especially inversions, can prevent chromosomal pairing and drive reproductive isolation and speciation (Rieseberg 2001;Stevison et al. 2011;McGaugh and Noor 2012;Ayala et al. 2013;Jeffares et al. 2017). In these circumstances, selfing probably increases tolerance to such rearrangements and can even promote their fixation. For example, karyotypic changes from 8 to 5 chromosomes in A. thaliana are linked to a transition to self-compatibility at about 500 Kya (Durvasula et al. 2017). A. lyrata transitions to selfing are more recent but are consistent with this observation. Interestingly, the inversions found within A. thaliana (Jiao and Schneeberger 2020; Goel and Schneeberger 2022) and within A. lyrata (this study) are comparable in size: up to Table 2. The Genotypes and Phenotypes of Outcrossing Mother Plants (TE10.3-2 and TE11.1-2) and F1 Progeny From Their Pollination by NT1 Self-compatible A. lyrata Accession With AhS12 S-allele (SCR Present and SRK Lost- fig. 2B). Mating types are abbreviated with SC for self-compatibility and SI for self-incompatibility.  fig. S1C-E, Supplementary Material online). The lack of so-called "selfing syndrome" in the latter  4. (A) Self-pollinated F1 progeny (F1.1-1) resulting from a cross between a self-incompatible (shown in B) ♀ TE10.3-2 Arabidopsis lyrata accession and ♂ NT1 self-compatible A. lyrata accession shows pollen tube growth (yellow arrow) and dominance of self-compatibility in the F1 generation. (B) Self-pollinated self-incompatible A. lyrata accession TE10.3-2 (used as the maternal plant in A shows no pollen tube growth, demonstrating its self-incompatibility. (C ) The geographical distribution of Arabidopsis kamchatica S-haplotypes shows a strong population structure across the species range. Circles are individual accessions, with S-haplogroups indicated by colors of pie slices. Arabidopsis halleri orthologous S-haplogroups are mentioned in the parenthesis next to the A. kamchatica S-haplogroups (AkS-A-E). Circle outline indicates either previously published data (grey) or newly reported accessions (black). A. kamchatica occurrences from the Global Biodiversity Information Facility (GBIF) are indicated by transparent grey dots. Transition to Self-compatibility Associated with Dominant S-allele · https://doi.org/10.1093/molbev/msad122 MBE was described previously (Carleial et al. 2017). Similarly, in the outcrossing species Leavenworthia alabamica, two independent selfing lineages have been described, with the older (∼150 Kya) showing an obvious selfing syndrome whereas the younger selfing lineage (∼48 Kya) did not (Busch et al. 2011). Although further investigation is required to quantify this difference in flower size, such observations in A. lyrata may also be explained by differences in transition to selfing: The North American A. lyrata likely transitioned to selfing during or after colonization of the area, around ∼10 Kya (Carleial et al. 2017), which is much more recent than our estimates of the Siberian selfer originating ∼90 Kya. Transition to Self-compatibility in Siberian A. lyrata Is Associated with S-locus All selfing Siberian accessions spanning the massive geographical area between Lake Taymyr and Chukotka share the same S-haplotype (AhS12), suggesting the breakdown of self-incompatibility in Siberia is linked to the S-locus. This also suggests a single breakdown of selfincompatibility in the Siberian selfing lineage, as it is unlikely that this transition to self-compatibility occurred independently in multiple individuals with the same AhS12 allele. More than one origin of self-compatibility in the studied Siberian populations is improbable for two reasons: First, the S-locus is highly diverse, as tens of divergent S-alleles typically segregate within outcrossing populations to facilitate reproductive success (Schierup et al. 2001;Castric and Vekemans 2004;Castric and Vekemans 2007), so one would expect more diversity of S-alleles if there were multiple origins; and second, because dominant alleles, including AhS12, are more rare compared with recessive ones (Schierup et al. 1997;Billiard et al. 2007;Genete et al. 2020). The probability of independent loss-of-function on two of the same rare alleles is low (multiplied probabilities of drawing the same rare allele by chance). However, the breakdown of self-incompatibility in North American A. lyrata is not associated with a specific S-allele or S-locus mutation, but rather with another genetic factor, likely in a downstream cascade of reactions preventing pollen-tube growth Foxe et al. 2010;Mable et al. 2017). Therefore, despite strong evidence supporting an S-locus-driven loss of selfincompatibility in Siberian A. lyrata, it is possible that a mutation in a downstream cascade caused the initial mating system switch (Goring et al. 2014;Jany et al. 2019), followed by fixation of a single S-allele due to drift, and further degeneration of the S-allele sequence, reinforcing self-compatibility. Another scenario could involve a modifier mutation specific to AhS12 S-allele, which arose prior to loss-of-function in the S-locus. The existence of allelespecific modifiers has been proposed based on observed segregation patterns in offspring (Nasrallah et al. 2004 Whereas these alternative explanations are plausible, based on the strong association between a specific S-haplotype (AhS12) and self-compatibility, we conclude that inactivation of AhS12 is the most likely scenario.

Self-compatibility in Siberian A. lyrata Is Likely Male-driven
Our long-read-based genome assembly of A. lyrata NT1 contains a fully-assembled S-locus ( fig. 2), in which we manually annotated SCR by Blast analysis of all known SCR sequences in Arabidopsis. The SRK gene was absent from our assembly. Mapping of the short reads from the A. lyrata NT1 accession to A. halleri AhS12 sequence of the same haplotype also did not yield any coverage of the SRK gene, so we conclude that SRK was lost from the NT1 genome. However, this does not mean that the loss of SRK is the causal mutation leading to selfing, as the SCR protein of NT1 A. lyrata also appears to be nonfunctional: 1) it lacks one of the eight cysteine residues ( fig. 2C) that were shown to be functionally important (Kusaba et al. 2001;Mishima et al. 2003;Tsuchimatsu et al. 2010) (fig. 2C), and 2) its expression was not detected in flowers (supplementary fig. S9, Supplementary Material online). Genotyping of the S-locus in other selfing A. lyrata accessions reveals that all of them share the same S-haplotype AhS12 ( fig. 3C and supplementary table S1, Supplementary Material online), which suggests their shared origin. Moreover, one of the selfing A. lyrata accessions has SRK, but seems to lack SCR (accession number MW0079456, fig. 3). Different reciprocal gene loss mutations of SCR or SRK across accessions ( fig. 3B) exclude the possibility of gene loss being a causal mutation and rather suggest that gene loss happened after a common causal mutation.
In controlled crossing experiments (Tsuchimatsu et al. 2012), haplogroup-D SRK in the A. lyrata subgenome of A. kamchatica (AkSRK-D, orthologous to AhSRK12) was shown to be functional. This suggests that SRK in the ancestors of both A. kamchatica and selfing A. lyrata was also functional. We discuss the role of selfing A. lyrata in the origin of A. kamchatica in the next section. If the breakdown of self-incompatibility is indeed S-locus driven (and not caused by an unlinked S-allele specific modifier), it most likely occurred on SCR rather than SRK in this lineage. Our results show that indeed, SCR from NT1 is not recognized by a functional SRK of the same haplogroup (from accession NT8.4-24; supplementary fig. S11, Supplementary Material online). Whether the initial loss-of-function in the SCR protein was due to a loss of a structurally important cysteine residue ( fig. 2C) or a loss of expression (supplementary fig. S9A and B, Supplementary Material online) is unclear. Transitioning to selfing through degradation of male specificity gene would be consistent with the recurrent pattern in the evolution of self-compatibility (reviewed in Shimizu and Tsuchimatsu (2015)). According to Bateman's principle, Kolesnikova et al. · https://doi.org/10.1093/molbev/msad122 MBE an S-haplotype with nonfunctional SCR and functional SRK will produce pollen of higher fitness, as it will be compatible with all other S-haplotypes including itself. In contrast, an S-haplotype with a functional SCR and a nonfunctional SRK will produce pollen that will be selfcompatible but incompatible with the fraction of the population carrying the same, albeit fully functional, S-haplotype. Pistils with a nonfunctional SRK do not have a higher fitness unless pollen availability is very limited, making fixation of the male-driven selfing more likely (Bateman 1954;Tsuchimatsu and Shimizu 2013). The most likely scenario suggested by our results, where selfcompatibility in Siberian A. lyrata is SCR-driven, is therefore consistent with Bateman's principle.

Self-compatible Siberian A. lyrata Is Ancestral to A. kamchatica
A previous study showed a Siberian A. lyrata accession (lyr-pet4) to be genetically closest to A. kamchatica, however this was limited to sampling in a single locality and did not include assessment of S-alleles (Shimizu-Inatsugi et al. 2009;Paape et al. 2018). In addition to the previously reported selfing individual, our field and herbarium collections yielded seven more self-compatible accessions, spanning a wide geographical range across Siberia (supplementary table S1, Supplementary Material online). We explored the relationships among all Siberian A. lyrata accessions with A. kamchatica using network analysis and hierarchical clustering. Genetic network of Nei's D ( fig. 3B) shows that A. kamchatica clusters closely to selfcompatible Siberian A. lyrata, which is consistent with the sister relationship between A. kamchatica and selfcompatible A. lyrata in a well-supported ML phylogeny (supplementary fig. S13B, Supplementary Material online). Moreover, we identified a fixed S-allele (AhS12) associated with self-compatibility in Siberian A. lyrata.
Allopolyploid A. kamchatica has three S-alleles inherited from A. halleri-AhS26 (AkS-A), AhS47 (AkS-B), and AhS1 (AkS-C) and two S-alleles inherited from A. lyrata -AhS12 (AkS-D) and AhS02 (AkS-E) (Tsuchimatsu et al. 2012). The AhS12 S-allele is the most frequent in the A. lyrata subgenome of A. kamchatica and was inherited from a self-compatible Siberian A. lyrata lineage. A tree of A. lyrata and A. kamchatica accessions, which share the AhS12 haplotype (based on exon 1 of the SRK gene; supplementary fig. S13A, Supplementary Material online), shows that a self-compatible A. lyrata accession is nested within a clade of A. kamchatica accessions, providing further support for their shared origin.
Furthermore, our demographic modeling suggests the Siberian selfing lineage originated approximately 90 Kya. This is in line with estimates by Paape et al. (2018), who dated the divergence times of both A. kamchatica subgenomes. Their estimates for divergence time of the A. halleri subgenome range from ∼60 to 100 Kya and the A. lyrata subgenome between ∼70 Kya and 140 Kya. The authors recommend caution when interpreting these parameters, and we agree that: Mutation rates used in both studies are from A. thaliana rather than A. lyrata, and sample sizes are small in both cases. Still, given the overlap in divergence estimates from both our study and work by Paape et al. (2018), it is plausible that at least one of the multiple polyploid origins of A. kamchatica included this selfing Siberian A. lyrata lineage as a parental genome donor.
Combinations of A. kamchatica S-alleles show a strong population structure ( fig. 4C) consistent with multiple origins of A. kamchatica in different geographical regions (Shimizu et al. 2005;Shimizu-Inatsugi et al. 2009;Tsuchimatsu et al. 2012;Paape et al. 2018). However, the current sampling of A. kamchatica is biased towards Japan and the Kamchatka Peninsula, and this uneven coverage of the species range means that observed frequencies of S-allele combinations may not represent their true distribution. That said, a combination of dominant nonfunctional AhS12 (A. lyrata-derived) and recessive AhS1 (A. halleri-derived) S-alleles is common in A. kamchatica in the eastern Siberian mountains bordering Okhotsk sea in Aldan-Amur interfluve ( fig. 4C).
Interestingly, whereas both progenitors of A. kamchatica coexist in Europe, and interspecific crosses can be created ex situ (Sarret et al. 2009), A. lyrata and A. halleri do not form other allotetraploids (Clauss and Koch 2006;Schmickl et al. 2010). The variation (or lack thereof) of mating systems in A. lyrata and A. halleri can explain why allopolyploid establishment is limited to Asia: A. halleri is self-incompatible throughout its range (no known selfing accessions have been described to date), and selfing A. lyrata is found only in Siberia and North America. Previous work showed that self-compatibility in A. kamchatica was likely male (SCR)-driven in the more dominant S-haplotype inherited from A. lyrata (Ah12/Al42/Ak-D) (Tsuchimatsu et al. 2012). We argue that self-compatibility is ancestral to A. kamchatica, and inherited from Siberian A. lyrata. We also show that dominance between nonfunctional AhS12 and functional AhS01 is retained in selfcompatible A. lyrata ( fig. 4A and B) and therefore argue that the transition to selfing in A. kamchatica with this combination of S-alleles was likely immediate upon allopolyploid formation. Our results show that Siberian selfing diploid A. lyrata is ancestral to allotetraploid A. kamchatica, and contributed the most widely observed A. lyrataderived S-allele (AhS12) in A. kamchatica. Furthermore, the nonfunctional AhS12 S-allele is still dominant over the recessive AhS01 S-allele in A. lyrata. This dominance of the nonfunctional S-allele likely explains the transition to self-compatibility in A. kamchatica with the same combination of S-alleles (AhS12/AkS-D and AhS01/AkS-C), rather than self-compatibility evolving de novo in A.

kamchatica.
Similar examples where a loss-of-function mutation on a dominant S-haplotype in one progenitor facilitated transition to selfing in allotetraploids have been recently reviewed (Novikova et al. 2022) and include A. suecica (Novikova et al. 2017), Capsella bursa-pastoris (Bachmann et al. 2019(Bachmann et al. , 2021Duan et al. 2023), and Transition to Self-compatibility Associated with Dominant S-allele · https://doi.org/10.1093/molbev/msad122 MBE Brassica napus (Okamoto et al. 2007;Kitashiba and Nasrallah 2014). Allopolyploid establishment may be facilitated by a transition to self-compatibility, ensuring reproductive success in the face of limited mating partners.

Plant Collection and Growth
We collected seeds from three A. lyrata populations (NT1, NT8, and NT12) during an expedition to the Yakutia region in Russia in the summer of 2019 (supplementary table S1, Supplementary Material online). Multiple individual plants were collected from those three populations: three individuals from NT1 (NT1_1, NT1_2, and NT1_3), four from NT8 (NT8_1, NT8_2, NT8_3, and NT8_4) and two from NT12 (NT12_1 and NT12_2). Collected seeds were grown in the greenhouse at 21 °C, under 16 h of light per day until a full rosette was formed, after which plants were moved to open frames outside on the grounds of the Max Planck Institute for Plant Breeding Research in Cologne, Germany. We grew several seeds per collected bag of seeds from individual plants, each was given an additional number extension (e.g., NT1_1_1, NT1_1_2, etc.). In this work, we only used the plants with last extension 1. All the individuals grown from NT1 population formed long fruits and appeared to be selfing.

Pollen Tube Staining to Characterize Mating Type
Almost mature flower buds were opened and after removing the anthers, they manually pollinated. Pistils were collected 2-3 h after pollination, fixed for 1.5 h in 10% acetic acid in ethanol, and softened in 1 M NaOH overnight. Before staining, the tissue was washed three times in KPO 4 buffer (pH 7.5). For staining, we submerged the tissue in 0.01% aniline blue for 10-20 min. After that, pistils were transferred to slides into mounting media and observed under UV light (Lu 2011). A self-compatible reaction was called if we counted more than ten pollen tubes.
Long-read Sequencing for de novo Genome Assembly DNA extraction, library preparation, and long-read sequencing of the NT1 and MN47 A. lyrata accessions were performed by the Max Planck-Genome-centre Cologne, Germany (https://mpgc.mpipz.mpg.de/home/). High molecular weight DNA was isolated from 1.5 g material with a NucleoBond HMW DNA kit (Macherey Nagel). Quality was assessed with a FEMTOpulse device (Agilent), and quantity was measured by a Quantus fluorometer (Promega). HiFi libraries were then prepared according to the manual "Procedure & Checklist-Preparing HiFi SMRTbell® Libraries using SMRTbell Express Template Prep Kit 2.0" with an initial DNA fragmentation by g-Tubes (Covaris) and final library size selection on BluePippin (Sage Science). Size distribution was again controlled by FEMTOpulse (Agilent). Size-selected libraries were then sequenced on a Sequel II device with Binding Kit 2.0 and Sequel II Sequencing Kit 2.0 for 30 h (Pacific Biosciences).

Short-read Sequencing for Population Analyses
Plant material was processed in two different ways, indicated by types I and II in supplementary table S1, Supplementary Material online.
Type I: Herbarium material was extracted in a dedicated clean-room facility (Ancient DNA Laboratory, Department of Archaeology, University of Cambridge). The lab has strict entry and surface decontamination protocols, and no nucleic acids are amplified in the lab. For each accession, leaf and/ or stem tissue was placed in a 2 ml tube with two tungsten carbide beads and ground to a fine powder using a Qiagen Tissue Lyser. Each batch of extractions included a negative extraction control (identical but without tissue). DNA was extracted using the DNeasy Plant Mini Kit (Qiagen). Library preparation and sequencing were performed by Novogene Ltd (UK). Sequencing libraries were generated using NEBNext® DNA Library Prep Kit following manufacturer's recommendations, and indices were added to each sample. The genomic DNA is randomly fragmented to a size of 350 bp by shearing, then DNA fragments were end polished, A-tailed, and ligated with the NEBNext adapter for Illumina sequencing, and further enriched by polymerase chain reaction (PCR) on P5 and indexed P7 oligos. The PCR products were purified (AMPure XP system), and resulting libraries were analyzed for size distribution by Agilent 2100 Bioanalyzer and quantified using real-time PCR.
Type II: Genomic DNA was isolated with the "NucleoMag© Plant" kit from Macherey and Nagel (Düren, Germany) on the KingFisher 96Plex device (Thermo) with programs provided by Macherey and Nagel. Random samples were selected for a quality control to ensure intact DNA as a starting point for library preparation. TPase-based libraries were prepared as outlined by (Rowan et al. 2019) on a Sciclone (PerkinElmer) robotic device. Short-read (PE 150 bp) sequencing was performed by Novogene Ltd (UK), using a NovaSeq 6000 S4 flow cell Illumina system.

Transcriptome Sequencing for S-locus Gene Expression Assessment
We used three flash-frozen open flowers of the A. lyrata NT1 accession as input material for RNA sequencing, which we used to assess the expression of the S-locus genes. RNA was extracted by the RNeasy Plant Kit (Qiagen) including an on-column DNase I treatment. Quality was assessed by Agilent Bioanalyser and the amount was calculated by an RNA-specific kit for Quantus (Promega). An Illumina-compatible library was prepared with the NEBNext® Ultra™ II RNA Library Prep Kit for Illumina ® and finally sequenced on a HiSeq 3000 at the Max Planck-Genome-centre Cologne, Germany.
Kolesnikova et al. · https://doi.org/10.1093/molbev/msad122 MBE PacBio de novo Assembly and Annotation of NT1 and MN47 A. lyrata Accessions Raw PacBio reads of NT1 were assembled using Hifiasm assembler (Cheng et al. 2021) in the default mode, choosing the primary contig graph as our resulting assembly. The completeness of our assembly was assessed using BUSCO (Seppey et al. 2019) with Brassicales_odb10 set. Repeated sequences were masked using RepeatMasker (Smit et al. 2013(Smit et al. -2015 with the merged libraries of RepBase A. thaliana repeats and NT1 A. lyrata repeats, which we modeled with RepeatModeler (Smit andHubley 2008-2015). Then, annotation from the reference MN47 genome (Rawat et al. 2015) was transferred to our NT1 repeat-masked assembly by using Liftoff (Shumate and Salzberg 2020). Contigs were reordered according to their alignment to the reference chromosomes and updated gene and repeat annotations using RagTag (Alonge et al. 2019) in the scaffolding mode without correction. Assembly of MN47 PacBio reads was done using the Hifiasm assembler with the same parameters.
Synteny Analysis of A. lyrata, A. suecica, and C. rubella Genomes Synteny analysis was done by performing an all-against-all BlastP search using the coding sequences of both genomes. We used SynMap (Haug-Baltzell et al. 2017), a tool from the online platform CoGe, with the default parameters for DAGChainer. The Quota Align algorithm was used to decide on the syntenic depth, employing the default parameters. Syntenic blocks were not merged. The results were visualized using the R (version 4.1.2) library "circlize" (version 0.4.13), as well as using plotsr (version 0.5.3) (Goel and Schneeberger 2022) for the supplementary figures, Supplementary Material online.

HiC Sequencing of NT1 A. lyrata Accession to Validate Structural Variants
A chromatin-capture library of the NT1 A. lyrata accession was prepared by the Max Planck-Genome-centre Cologne, Germany and was used for validation of the large inversions in whole-genome comparisons. We followed the Dovetail® Omni-C® Kit starting with 0.5 g of fresh weight as input. Libraries were quantified and quality assessed by capillary electrophoresis (Agilent Tapestation) and then sequenced at the Novogene Ltd (UK), using a NovaSeq Illumina system.

Mapping of Hi-C Reads for the A. lyrata Accessions NT1 and MN47
To validate the assembled scaffolds of A. lyrata, we used proximity-ligation short read Hi-C data. For NT1, Hi-C reads were mapped to the repeat-masked NT1 genome assembly, using the mapping pipeline proposed by the manufacturer (https://omni-c.readthedocs.io/en/latest/ index.html). The Dovetail Omni-C processing pipeline is based on BWA (Li and Durbin 2009), pairtools (https:// github.com/mirnylab/pairtools), and Juicertools (Durand et al. 2016). We mapped the Hi-C reads for MN47 (released previously ) to a repeat masked MN47 genome (Hu et al. 2011) and to a repeat masked version of the newly assembled MN47 genome (in this paper) using HiCUP (version 0.6.1) (Wingett et al. 2015). The assemblies were manually examined using Juicebox (Robinson et al. 2018). Plots of the HiC contact matrix were made using the function hicPlotMatrix from HiCExplorer (Wolff et al. 2020) (version 3.7.2).

Validation of Structural Variants Between NT1 and MN47 A. lyrata Accessions
To validate the inversions (supplementary table S2, Supplementary Material online), we used PacBio, Hi-C data, and synteny analysis results. Guided by synteny analyses, we first identified inversion breakpoints. Then, we investigated the long-read map at these regions and either confirmed their contiguity or manually flipped the genomic region, followed by another round or long-read map investigation (supplementary figs. S3-S8, Supplementary Material online). To map the PacBio HiFi reads we used Winnowmap (Jain et al. 2020). As the last step, we analyzed the Hi-C contact maps in the same regions to show that there is no evidence for alternative genome assembly configurations (supplementary figs. S3-S7, Supplementary Material online).

A. lyrata NT1 S-locus Genotyping and Manual Annotation
We manually annotated the S-locus in our initial assembly before the reference-guided reordering and scaffolding. In the transferred annotation resulting from Liftoff (Shumate and Salzberg 2020), we found both of the flanking genes (U-box and ARK3) in the same contig. The final coordinates of the S-locus in the NT1 assembly on scaffold 7 are 9,291,658 bp to 9,336,246 bp. The length of the assembled NT1 A. lyrata S-locus including both flanking genes is about 44.5 Kbp. We mapped PacBio long reads back to the assembled NT1 genome using minimap2 (Li 2018) with default parameters in order to make sure that there are no obvious gaps in coverage or breakpoints (supplementary fig. S8, Supplementary Material online). Similar to Zhang et al. (2019), we blasted the SRK and SCR sequences from all the known S-haplotypes across Arabidopsis and Capsella to the A. lyrata NT1 S-locus, finding a single hit at the SCR gene from the AhS12 haplogroup. We constructed a comparative structure plot of A. lyrata NT1 and A. halleri S12 (GenBank accession KJ772374) S-loci ( fig. 2B) using the R library genoPlotR (Guy et al. 2010). We aligned SCR protein sequences using MAFFT with default parameters and estimated a phylogenetic tree with RaxML (Stamatakis 2014) using the BLOSUM62 substitution model and visualized the alignment ( fig. 2C) using Jalview2 (Waterhouse et al. 2009). The phylogenetic tree was visualized using R package "ape" (Paradis et al. 2004). Transition to Self-compatibility Associated with Dominant S-allele · https://doi.org/10.1093/molbev/msad122 MBE A. lyrata S-allele Genotyping From Short-read Sequencing Data The S-alleles from all the re-sequenced samples used in the  population  analysis  (supplementary  table  S1, Supplementary Material online) and crosses (NT8.4-24, submitted to ENA under ERS12276051) were genotyped using the S-locus genotyping pipeline NGSgenotyp . The list of SRK and SCR alleles used as a reference data set is provided in the supplementary table S3, Supplementary Material online, and the corresponding sequences for SRK and SCR alleles are provided in the supplementary Data 1 and 2, Supplementary Material online. Using the NGSgenotyp pipeline, we could not identify any S-haplotypes for DRR124344 (lyrpet4), for either SRK or SCR databases. However, we found a partial SCR gene sequence matching the AhS12 haplotype by blasting the SCR database to the DRR124344 assembly. We translated the SCR nucleotide sequence and aligned the resulting protein sequence with SCR proteins from other accessions using MAFFT (Katoh and Standley 2013) using default parameters. The resulting alignment shows that SCR from DRR124344 is shorter compared with NT1 or AhS12. To confirm that SCR from DRR124344 belongs to the AhS12 haplotype, we estimated a maximum likelihood tree using IQ-tree web service (http://www.iqtree.org/) with default parameters (supplementary fig. S10B, Supplementary Material online).

Separation of Subgenomes From A. kamchatica Accessions
To isolate the A. lyrata subgenome of A. kamchatica, we used a combined reference, containing A. lyrata NT1 and A. halleri ssp. gemmifera reference genomes (Briskine et al. 2016). We mapped A. kamchatica short reads to the combined reference with bwa mem (0.7.17)  and filtered for reads mapped uniquely to A. lyrata NT1 using samtools ). We then genotyped the resulting A. lyrata-subgenome bam files for each A. kamchatica accession as described above for diploid samples.

Tree and Network Estimation
Genome-wide SNP Tree We filtered the vcf generated above to include only biallelic SNPs without missing data, which resulted in 2,261,679 SNPs. These data were read into R (version 4.1.1) and from them, we estimated a neighbor-joining tree using the nj function from package ape (Paradis and Schliep 2019). We then visualized the neighbor-joining tree as a cladogram using ggtree (Yu et al. 2017(Yu et al. , 2018Yu 2020) and annotated the tips with associated data (supplementary fig. S13B, Supplementary Material online). We then further filtered this data set to include only Siberian A. lyrata and an outgroup (excluding A. kamchatica from this portion) to generate the lyrata-only tree ( fig. 3C).

Network Based on Nei's D and Phylogenetic Inference
We filtered a vcf of biallelic SNPs shared by the lyrata subgenome of A. kamchatica and all A. lyrata accessions down to just four-fold degenerate sites, with maximum 10% missing data across individuals, resulting in 4,141 SNPs. We read the vcf with both Siberian A. lyrata and A. kamchatica into R using vcfR (Knaus and Grünwald 2017), then calculated Nei's D (Nei 1972) between individuals using StAMPP (Pembleton et al. 2013). We visualized the resulting matrix in SplitsTree4 (Huson and Bryant 2006) and in R using the pheatmap package (Kolde 2019). To further explore the evolutionary relationships among accessions, we generated a nexus file from the vcf using vcf2phylip (Ortiz 2019), which served as input for phylogenetic inference with IQTree (http://www.iqtree.org/)

SRK Tree
We assembled partial SRK sequences from Siberian A. lyrata and A. kamchatica accessions based on short-read sequencing data using the assembly step of the S-locus genotyping pipeline NGSgenotyp ) and aligned sequences with MAFFT (Katoh and Standley 2013). From this alignment estimated 1,000 bootstrap replicates of a ML phylogeny using RaXML (Stamatakis 2014) with substitution model GTR+Γ then visualized the best-scoring ML phylogeny using R package ape 5.0 (Paradis and Schliep 2019). The input alignment is available in supplementary Data 3, Supplementary Material online.

PCR Identification of AhS12 Haplotype
For DNA extraction, 1 cm of leaf material was frozen in liquid nitrogen and ground to a powder. We added 400 μl UltraFastPrep Buffer to the powdered tissue, then mixed, vortexed, and finally spun for 5 min at 5000 revolutions per minute (rpm). We then took 300 μl of the supernatant, added 300 μl isopropanol, and mixed by inversion. We again spun for 5 min at 5000 rpm, then discarded the supernatant and dried 10-30 min at 37 °C. The pellet was resuspended in 200 μl 1xTE and stored at 4 °C. We amplified the AhSRK12 allele by PCR using 1.5 μl of DNA solution and previously published primers (forward ATCATGGCAGTGGAACAC AG, reverse CAAATCAGACAACCCGACCC) (Ruggiero et al. 2008). We ran 35 cycles consisting of 30 s at 94 °C, 30 s annealing at 56.8 °C, and 40 s extension at 72 °C. We visualized PCR products via gel electrophoresis using 1.5% agarose gel with GelGreen® nucleic acid stain (

Demographic Modeling of Divergence Between Selfing and Outcrossing Siberian A. lyrata Lineages
We calculated nucleotide diversity using all biallelic and non-variant sites in 10 kb windows with custom script uploaded to github (https://github.com/novikovalab/selfing_ Alyrata). CIs for the median of the distribution were calculated using the basic bootstrap method in the R package "boot" (Davison and Hinkley 1997;Canty and Ripley 2022).
To prepare a joint allele frequency spectrum of the seven self-compatible accessions and the ten selfincompatible accessions, we first filtered the SNP-only vcf to remove centromeric, pericentromeric, and exonic regions. We subsequently filtered out sites with missing data to yield our final vcf for demographic inference. Following Nordborg and Donnelly (1997), we excluded sites heterozygous in the selfing population and treated selfers as haploid. We then generated the joint allele frequency spectrum using easySFS (https://github.com/ isaacovercast/easySFS). EasySFS produces output ready for use in fastsimcoal2 (fsc26) (Excoffier et al. 2013(Excoffier et al. , 2021, which we then used for demographic modeling. We tested five models for the origin of self-compatibility in Siberian A. lyrata as follows: 1) simple divergence, 2) divergence with symmetrical introgression (migration), 3) divergence with asymmetrical introgression, 4) simple divergence model as in Model 1 plus bottleneck in selfing population; and 5) Model 3 (asymmetric gene flow) plus bottleneck in selfing population.
For each model, we initiated 100 fastsimcoal2 runs. We then chose the best run for each model (the run with the best likelihood scores) and from that best run, we calculated the Aikake Information Criterion for the model. After selecting the model with the best AIC score, we used the maximum likelihood parameter file to generate 200 pseudo-observations of joint SFS for bootstrapping. For each of the 200 pseudo-observations, we initiated 100 fastsimcoal2 runs, then selected the best run for each model based on likelihood scores as above. The resulting parameter estimates from the 200 replicate pseudo-observations were used to calculate the 95% CIs in R. Site frequency spectra and other fastsimcoal2 input files (.tpl and .est) are on GitHub (https://github.com/ novikovalab/selfing_Alyrata). Because fastsimcoal2 reports haploid effective population sizes, we divided them by two to report numbers of diploid individuals (table 1). These parameters can be interpreted as the inverse of the coalescent rate estimated from our accessions.

Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.

Data Availability
The whole genome raw Illumina short reads for the samples used in this study were submitted to the ENA database under the project number PRJEB50329 (ERP134897). Individual accession names are listed in the supplementary table S1, Supplementary Material online. Raw PacBio HiFi reads of NT1 and MN47, Hi-C reads of NT1, RNAseq reads of NT1, and the genome assembly and annotation of A. lyrata NT1 (GCA_945152055) and MN47 (GCA_944990045) have been submitted to ENA Transition to Self-compatibility Associated with Dominant S-allele · https://doi.org/10.1093/molbev/msad122 MBE database under the same project number PRJEB50329 (ERP134897) and to https://figshare.com/projects/ Arabidopsis_lyrata_genome_assemblies/162343. Scripts associated with the project are at https://github.com/ novikovalab/selfing_Alyrata.